Introducing a web application for labeling, visualizing speech and correcting derived speech signals

Authors

  • Raphael Winkelmann
  • Georg Raess
Abstract

The advent of HTML5 has sparked considerable interest in the web as a development platform for a variety of research applications. Because software can be deployed easily to remote clients and standardized browser APIs have recently been developed, we argue that the browser has become a good platform for developing a speech labeling tool. This paper introduces a preliminary version of an open-source client-side web application for labeling speech data, visualizing speech and segmentation information, and manually correcting derived speech signals such as formant trajectories. The user interface has been designed to be as user-friendly as possible in order to make the sometimes tedious task of transcribing as easy and efficient as possible. The future integration into the next iteration of the EMU speech database management system and its general architecture will also be outlined, as the work presented here is only one of several components contributing to the future system.

Similar resources

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

Efficient Correction Interfaces for Speech Recognition

The recognition of speech by computers is a challenging task and recognition errors are ultimately unavoidable. Error correction is thus a crucial part of any speech recognition interface. In this thesis, I look at how to improve the correction process in speech recognition. Before errors can be corrected, they must first be detected. I look at improving error detection by visualizing the recog...

P65: Speech Recognition Based on Brain Signals by the Quantum Support Vector Machine for Inflammatory Patient ALS

People communicate with each other by exchanging verbal and visual expressions. However, paralyzed patients with various neurological diseases such as amyotrophic lateral sclerosis and cerebral ischemia have difficulties in daily communications because they cannot control their body voluntarily. In this context, brain-computer interface (BCI) has been studied as a tool of communication for thes...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...

Publication date: 2014